A wavelet-domain PSOLA approach
Abstract
A basic problem in concatenative speech synthesis is the discontinuities at the concatenation points. Units produced by different (independent) articulatory movements differ in their spectral characteristics even if their phonetic context is carefully chosen. This paper describes a wavelet transform of the spectrum of the speech concatenated within the PSOLA algorithm. This multiresolution analysis separates the following perceptually important spectral characteristics: the intrinsic pitch, which appears as a fine ripple in the spectrum; articulatory movements, which typically produce formant structures; and the global spectral tilt. In the wavelet domain each of these characteristics can be analysed and manipulated separately in a consistent and completely non-parametric way. Optimised concatenation points can easily be located. Remaining spectral irregularities can be adjusted efficiently, resulting in clear and natural-sounding synthetic speech. Dyadic filter banks are a computationally efficient implementation of the presented transform.
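As a rough illustration of the multiresolution idea, the sketch below runs a dyadic filter bank over a log-magnitude spectrum and regroups the scales into a fine band (pitch ripple), a middle band (formant-like structure) and a coarse band (overall tilt). The abstract does not say which wavelet, how many scales, or which scale grouping the authors use, so the Haar wavelet, the level counts, and the function names `haar_analysis`, `haar_synthesis` and `split_bands` are all assumptions made for this example, not the paper's method.

```python
import numpy as np

def haar_analysis(x, levels):
    """Dyadic Haar filter bank (assumed wavelet; the paper does not name one).

    Returns the coarse approximation and a list of detail bands,
    ordered from the finest scale to the coarsest.
    len(x) must be divisible by 2**levels.
    """
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        approx = (even + odd) / np.sqrt(2.0)         # low-pass branch
    return approx, details

def haar_synthesis(approx, details):
    """Inverse of haar_analysis: rebuild the signal from all bands."""
    approx = np.asarray(approx, dtype=float)
    for d in reversed(details):
        out = np.empty(2 * approx.size)
        out[0::2] = (approx + d) / np.sqrt(2.0)
        out[1::2] = (approx - d) / np.sqrt(2.0)
        approx = out
    return approx

def split_bands(log_spec, levels=6, fine_levels=3):
    """Split a log spectrum into fine, middle and coarse components.

    The grouping (which scales count as 'pitch ripple' or 'formants')
    is a hypothetical choice for illustration only.
    """
    approx, details = haar_analysis(log_spec, levels)
    zeros = [np.zeros_like(d) for d in details]
    no_approx = np.zeros_like(approx)
    fine = haar_synthesis(no_approx, details[:fine_levels] + zeros[fine_levels:])
    mid = haar_synthesis(no_approx, zeros[:fine_levels] + details[fine_levels:])
    coarse = haar_synthesis(approx, zeros)
    return fine, mid, coarse

# Synthetic log spectrum: tilt + two formant-like bumps + harmonic ripple.
k = np.arange(256)
log_spec = (-0.02 * k
            + 3.0 * np.exp(-((k - 40) / 12.0) ** 2)
            + 2.0 * np.exp(-((k - 110) / 15.0) ** 2)
            + 0.5 * np.cos(2 * np.pi * k / 8.0))
fine, mid, coarse = split_bands(log_spec)
```

Because the transform is linear and invertible, the three components sum back to the original spectrum, so each can be modified on its own before resynthesis. Each analysis stage halves the number of samples it processes, which is why a dyadic filter bank of this kind is cheap to compute.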
Similar resources
Concatenating syllables for response generation in spoken language applications
We describe our approach in developing a speech synthesis technique for response generation in domain-specific spoken language applications. Our approach handles two Chinese dialects – Cantonese and Putonghua. We chose the foreign exchange domain, and worked with its constrained vocabulary and response expressions. The syllable is selected to be our basic unit for concatenation. Each unit label...
Proposing New Methods to Enhance the Low-Resolution Simulated GPR Responses in the Frequency and Wavelet Domains
To date, a number of numerical methods, including the popular Finite-Difference Time Domain (FDTD) technique, have been proposed to simulate Ground-Penetrating Radar (GPR) responses. Despite having a number of advantages, the finite-difference method also has pitfalls such as being very time consuming in simulating the most common case of media with high dielectric permittivity, causing the for...
Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones
We review in a common framework several algorithms that have been proposed recently, in order to improve the voice quality of a text-to-speech synthesis based on acoustical units concatenation (Charpentier and Moulines, 1988; Moulines and Charpentier, 1988; Hamon et al., 1989). These algorithms rely on a pitch-synchronous overlap-add (PSOLA) approach for modifying the speech prosody and concate...
Economic growth and energy consumption: New evidence using continuous wavelet
This study examines the relationship between energy consumption and economic growth in the Iranian economy. To that end, a wavelet transformation technique is used, which allows us to combine the time domain and frequency domain characteristics of two time series together. Using this approach and data for the period 1991:2 – 2017:1, this study tries to overcome the shortcomings of standard eco...
Pitch control in diphone synthesis
A hybrid time domain and LPC approach to speech pitch control is developed. This approach uses a low order LPC analysis and residual excitation to alter pitch period length during voiced speech. This approach differs from standard residual excited LPC in that LP reconstruction is applied only during voiced segments. Listening tests were used to compare PSOLA and the hybrid method under conditi...